Correcting for Optimistic Prediction in Small Data Sets
Similar resources
Correcting for Optimistic Prediction in Small Data Sets
The C statistic is a commonly reported measure of screening test performance. Optimistic estimation of the C statistic is a frequent problem because of overfitting of statistical models in small data sets, and methods exist to correct for this issue. However, many studies do not use such methods, and those that do correct for optimism use diverse methods, some of which are known to be biased. W...
Correcting MM estimates for "fat" data sets
Regression MM estimates require the estimation of the error scale, and the determination of a constant that controls the efficiency. These two steps are based on asymptotic results that are derived assuming that the number of predictors p remains fixed while the number of observations n tends to infinity, which means assuming that the ratio p/n is small. However, many high-dimensional data sets ...
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
Kneser-Ney Smoothing With a Correcting Transformation for Small Data Sets
We present a technique which improves the Kneser–Ney smoothing algorithm on small data sets for bigrams, and we develop a numerical algorithm which computes the parameters for the heuristic formula with a correction. We give motivation for the formula with correction on a simple example. Using the same example, we show the possible difficulties one may run into with the numerical algorithm. App...
Link Prediction in Highly Fractional Data Sets
Extremist organizations all over the world increasingly use online social networks as a communication media for recruitment and planning. As such, online social networks are also a source of information utilized by intelligence and counter terror organizations investigating the relationships between suspected individuals. Unfortunately, the data mined from open sources is usually far from being...
Journal
Journal title: American Journal of Epidemiology
Year: 2014
ISSN: 0002-9262,1476-6256
DOI: 10.1093/aje/kwu140